120 research outputs found

    A systematic search for SNPs/haplotypes associated with disease phenotypes using a haplotype-based stepwise procedure

    BACKGROUND: Genotyping technologies enable us to genotype multiple Single Nucleotide Polymorphisms (SNPs) within selected genes/regions, providing data for haplotype association analysis. While haplotype-based association analysis is powerful for detecting untyped causal alleles in linkage-disequilibrium (LD) with neighboring SNPs/haplotypes, the inclusion of extraneous SNPs could reduce its power by increasing the number of haplotypes with each additional SNP. METHODS: Here, we propose a haplotype-based stepwise procedure (HBSP) to eliminate extraneous SNPs. To evaluate its properties, we applied HBSP to both simulated and real data, generated from a study of genetic associations of the bactericidal/permeability-increasing (BPI) gene with pulmonary function in a cohort of patients following bone marrow transplantation. RESULTS: Under the null hypothesis, use of the HBSP gave results that retained the desired false positive error rates when multiple comparisons were considered. Under various alternative hypotheses, HBSP had adequate power to detect modest genetic associations in case-control studies with 500, 1,000 or 2,000 subjects. In the current application, HBSP led to the identification of two specific SNPs with a positive validation. CONCLUSION: These results demonstrate that HBSP retains the essence of haplotype-based association analysis while improving analytic power by excluding extraneous SNPs. Minimizing the number of SNPs also enables simpler interpretation and more cost-effective applications.

    A statistical method for predicting splice variants between two groups of samples using GeneChip® expression array data

    BACKGROUND: Alternative splicing of pre-messenger RNA results in RNA variants with combinations of selected exons. It is one of the essential biological functions and regulatory components in higher eukaryotic cells. Some of these variants are detectable with the Affymetrix GeneChip® that uses multiple oligonucleotide probes (i.e., a probe set), since the target sequences for the multiple probes are adjacent within each gene. Hybridization intensity from a probe correlates with abundance of the corresponding transcript. Although the multiple-probe feature in the current GeneChip® was designed to assess expression values of individual genes, it also measures transcriptional abundance for a sub-region of a gene sequence. This additional capacity motivated us to develop a method to predict alternative splicing, taking advantage of extensive repositories of GeneChip® gene expression array data. RESULTS: We developed a two-step approach to predict alternative splicing from GeneChip® data. First, we clustered the probes from a probe set into pseudo-exons based on similarity of probe intensities and physical adjacency. A pseudo-exon is defined as a sequence in the gene within which multiple probes have comparable probe intensity values. Second, for each pseudo-exon, we assessed the statistical significance of the difference in probe intensity between two groups of samples. Differentially expressed pseudo-exons are predicted to be alternatively spliced. We applied our method to empirical data generated from GeneChip® Hu6800 arrays, which include 7129 probe sets and twenty probes per probe set. The dataset consists of sixty-nine medulloblastoma (27 metastatic and 42 non-metastatic) samples and four cerebellum samples as normal controls. We predicted that 577 genes would be alternatively spliced when we compared normal cerebellum samples to medulloblastomas, and predicted that thirteen genes would be alternatively spliced when we compared metastatic medulloblastomas to non-metastatic ones. We checked the consistency of some of our findings with information in the UCSC Human Genome Browser. CONCLUSION: The two-step approach described in this paper is capable of predicting some alternative splicing from multiple oligonucleotide-based gene expression array data with GeneChip® technology. Our method employs the extensive repositories of gene expression array data available and generates alternative splicing hypotheses, which can be further validated by experimental studies
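
The two-step idea in the abstract above can be illustrated with a minimal sketch. The abstract does not state the clustering rule or the significance test, so everything here is an assumption made for illustration: probes are grouped greedily when a probe's mean intensity lies within a hypothetical tolerance `tol` of the running cluster mean, and each pseudo-exon is scored with a Welch t statistic. The function names are hypothetical as well.

```python
import math
from statistics import mean, variance


def cluster_adjacent_probes(probe_means, tol):
    """Greedily group consecutive probe indices whose mean intensity stays
    within `tol` of the running cluster mean (an assumed rule, not the paper's)."""
    clusters = [[0]]
    for i in range(1, len(probe_means)):
        current = clusters[-1]
        centre = mean(probe_means[j] for j in current)
        if abs(probe_means[i] - centre) <= tol:
            current.append(i)
        else:
            clusters.append([i])
    return clusters


def welch_t(a, b):
    """Welch t statistic for two samples (assumes non-zero variance in at least one group)."""
    return (mean(a) - mean(b)) / math.sqrt(variance(a) / len(a) + variance(b) / len(b))


def pseudo_exon_t_scores(group_a, group_b, tol):
    """Step 1: cluster probes into pseudo-exons. Step 2: score each pseudo-exon
    by comparing per-sample pseudo-exon means between the two groups."""
    n_probes = len(group_a[0])
    all_samples = group_a + group_b
    probe_means = [mean(s[i] for s in all_samples) for i in range(n_probes)]
    scores = []
    for cluster in cluster_adjacent_probes(probe_means, tol):
        a = [mean(s[i] for i in cluster) for s in group_a]
        b = [mean(s[i] for i in cluster) for s in group_b]
        scores.append((cluster, welch_t(a, b)))
    return scores


# Toy data: each inner list is one sample's intensities for five adjacent probes.
group_a = [[10, 11, 10, 30, 31], [11, 10, 11, 29, 30], [10, 10, 12, 31, 30]]
group_b = [[10, 11, 10, 10, 11], [11, 11, 10, 12, 10], [10, 12, 11, 11, 12]]
scores = pseudo_exon_t_scores(group_a, group_b, tol=2.0)
```

In this toy input the last two probes differ sharply between the groups, so the second pseudo-exon receives a large statistic while the first does not. A real analysis would also need a multiple-testing correction across pseudo-exons.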

    A class of models for analyzing GeneChip® gene expression analysis array data

    BACKGROUND: Various analytical methods exist that first quantify gene expression and then analyze differentially expressed genes from Affymetrix GeneChip® gene expression analysis array data. These methods differ in the choice of probe measure (quantification of probe hybridization), the summarization of multiple probe intensities into a gene expression value, and the analysis of differential gene expression. Research papers that describe these methods focus on performance and on how their approaches differ from others. To better understand the common features and differences between various methods, and to evaluate their impact on the results of gene expression analysis, we describe a class of models, referred to as generalized probe models (GPMs), which encompass various currently available methods. RESULTS: Using an empirical dataset, we compared different formulations of GPMs, and GPMs with three other commonly used methods, i.e., MAS 5.0, dChip, and RMA. The comparison shows that, on a genome-wide scale, different methods yield similar results if the same probe measures are chosen. CONCLUSION: In this paper we present a general framework, i.e., GPMs, which encompasses various methods. GPMs permit the use of a wide range of probe measures and facilitate appropriate comparison between commonly used methods. We demonstrate that the dissimilar results stem primarily from different choices of probe measures, rather than other factors

    Accurate, precise modeling of cell proliferation kinetics from time-lapse imaging and automated image analysis of agar yeast culture arrays

    BACKGROUND: Genome-wide mutant strain collections have increased demand for high throughput cellular phenotyping (HTCP). For example, investigators use HTCP to investigate interactions between gene deletion mutations and additional chemical or genetic perturbations by assessing differences in cell proliferation among the collection of 5000 S. cerevisiae gene deletion strains. Such studies have thus far been predominantly qualitative, using agar cell arrays to subjectively score growth differences. Quantitative systems level analysis of gene interactions would be enabled by more precise HTCP methods, such as kinetic analysis of cell proliferation in liquid culture by optical density. However, requirements for processing liquid cultures make them relatively cumbersome and low throughput compared to agar. To improve HTCP performance and advance capabilities for quantifying interactions, YeastXtract software was developed for automated analysis of cell array images. RESULTS: YeastXtract software was developed for kinetic growth curve analysis of spotted agar cultures. The accuracy and precision for image analysis of agar culture arrays were comparable to OD measurements of liquid cultures. Using YeastXtract, image intensity vs. biomass of spot cultures was linearly correlated over two orders of magnitude. Thus cell proliferation could be measured over about seven generations, including four to five generations of relatively constant exponential phase growth. Spot area normalization reduced the variation in measurements of total growth efficiency. A growth model, based on the logistic function, increased precision and accuracy of maximum specific rate measurements, compared to empirical methods. The logistic function model was also more robust against data sparseness, meaning that less data was required to obtain accurate, precise, quantitative growth phenotypes. CONCLUSION: Microbial cultures spotted onto agar media are widely used for genotype-phenotype analysis; however, quantitative HTCP methods capable of measuring kinetic growth rates have not been available previously. YeastXtract provides objective, automated, quantitative image analysis of agar cell culture arrays. Fitting the resulting data to a logistic equation-based growth model yields robust, accurate growth rate information. These methods allow the incorporation of imaging and automated image analysis of cell arrays, grown on solid agar media, into HTCP-driven experimental approaches, such as global, quantitative analysis of gene interaction networks
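
As a rough illustration of fitting a logistic growth model to a time series, the sketch below estimates the rate parameter by linear regression on logit-transformed values. This is not the YeastXtract fitting procedure, which the abstract does not describe. It assumes the standard three-parameter logistic form N(t) = K / (1 + exp(-r (t - t0))) and assumes the carrying capacity `K` is already known, for example from the plateau of the curve. All names are hypothetical.

```python
import math


def logistic(t, K, r, t0):
    """Logistic growth curve with carrying capacity K, rate r, and midpoint time t0."""
    return K / (1.0 + math.exp(-r * (t - t0)))


def fit_logistic_rate(times, values, K):
    """Estimate r and t0 by least squares on ln(N / (K - N)) = r * (t - t0).

    K is assumed known; points outside the open interval (0, K) are skipped
    because the logit transform is undefined there.
    """
    xs, ys = [], []
    for t, n in zip(times, values):
        if 0 < n < K:
            xs.append(t)
            ys.append(math.log(n / (K - n)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / sxx                    # slope of the logit line
    intercept = mean_y - r * mean_x  # equals -r * t0
    return r, -intercept / r


# Noise-free synthetic curve used only to exercise the fit.
times = list(range(0, 21))
values = [logistic(t, K=100.0, r=0.5, t0=10.0) for t in times]
r_hat, t0_hat = fit_logistic_rate(times, values, K=100.0)
```

With noisy image-intensity data, a nonlinear least-squares fit of all three parameters would usually be preferable, since the logit transform amplifies noise near zero and near the plateau.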

    Sequencing genes in silico using single nucleotide polymorphisms

    BACKGROUND: The advent of high throughput sequencing technology has enabled the 1000 Genomes Project Pilot 3 to generate complete sequence data for more than 906 genes and 8,140 exons representing 697 subjects. The 1000 Genomes database provides a critical opportunity for further interpreting disease associations with single nucleotide polymorphisms (SNPs) discovered from genetic association studies. Currently, direct sequencing of candidate genes or regions on a large number of subjects remains both cost- and time-prohibitive. RESULTS: To accelerate the translation from discovery to functional studies, we propose an in silico gene sequencing method (ISS), which predicts phased sequences of intragenic regions, using SNPs. The key underlying idea of our method is to infer diploid sequences (a pair of phased sequences/alleles) at every functional locus utilizing the deep sequencing data from the 1000 Genomes Project and SNP data from the HapMap Project, and to build prediction models using flanking SNPs. Using this method, we have developed a database of prediction models for 611 known genes. Sequence prediction accuracy for these genes is 96.26% on average (ranges 79%-100%). This database of prediction models can be enhanced and scaled up to include new genes as the 1000 Genomes Project sequences additional genes on additional individuals. Applying our predictive model for the KCNJ11 gene to the Wellcome Trust Case Control Consortium (WTCCC) Type 2 diabetes cohort, we demonstrate how the prediction of phased sequences inferred from GWAS SNP genotype data can be used to facilitate interpretation and identify a probable functional mechanism such as protein changes. CONCLUSIONS: Prior to the general availability of routine sequencing of all subjects, the ISS method proposed here provides a time- and cost-effective approach to broadening the characterization of disease associated SNPs and regions, and facilitating the prioritization of candidate genes for more detailed functional and mechanistic studies.

    Gene expression profiling identifies genes predictive of oral squamous cell carcinoma

    Oral squamous cell carcinoma (OSCC) is associated with substantial mortality and morbidity. To identify potential biomarkers for early detection of invasive OSCC, we compared gene expression of incident primary OSCC, oral dysplasia, and clinically normal oral tissue from surgical patients without head and neck cancer or pre-neoplastic oral lesions (controls), using Affymetrix U133 2.0 Plus arrays. We identified 131 differentially expressed probe sets using a training set of 119 OSCC patients and 35 controls. Forward and stepwise logistic regression analyses identified 10 successive combinations of genes whose expression differentiated OSCC from controls. The best model included LAMC2, encoding laminin gamma 2 chain, and COL4A1, encoding collagen, type IV, alpha 1 chain. Subsequent modeling without these two markers showed that COL1A1, encoding collagen, type I, alpha 1 chain, and PADI1, encoding peptidyl arginine deiminase, type 1, also can distinguish OSCC from controls. We validated these two models using an internal independent testing set of 48 invasive OSCC and 10 controls and an external testing set of 42 head and neck squamous cell carcinoma (HNSCC) cases and 14 controls (GEO GSE6791), with sensitivity and specificity above 95%. These two models were also able to distinguish dysplasia (n=17) from control (n=35) tissue. Differential expression of these four genes was confirmed by qRT-PCR. If confirmed in larger studies, the proposed models may hold promise for monitoring local recurrence at surgical margins and the development of second primary oral cancer in OSCC patients

    Genomewide gene expression profiles of HPV-positive and HPV-negative oropharyngeal cancer: potential implications for treatment choices.

    OBJECTIVE: To study the difference in gene expression between human papillomavirus (HPV)-positive and HPV-negative oral cavity and oropharyngeal squamous cell carcinoma (OSCC). DESIGN: We used Affymetrix U133 plus 2.0 arrays to examine gene expression profiles of OSCC and normal oral tissue. The HPV DNA was detected using polymerase chain reaction followed by the Roche LINEAR ARRAY HPV Genotyping Test, and the differentially expressed genes were analyzed to examine their potential biological roles using the Ingenuity Pathway Analysis Software, version 5.0. SETTING: Three medical centers affiliated with the University of Washington. PATIENTS: A total of 119 patients with primary OSCC and 35 patients without cancer, all of whom were treated at the setting institutions, provided tissue samples for the study. RESULTS: Human papillomavirus DNA was found in 41 of 119 tumors (34.5%) and 2 of 35 normal tissue samples (5.7%); 39 of the 43 HPV specimens were HPV-16. A higher prevalence of HPV DNA was found in oropharyngeal cancer (23 of 31) than in oral cavity cancer (18 of 88). We found no significant difference in gene expression between HPV-positive and HPV-negative oral cavity cancer but found 446 probe sets (347 known genes) differentially expressed in HPV-positive compared with HPV-negative oropharyngeal cancer. The most prominent functions of these genes are DNA replication, DNA repair, and cell cycling. Some genes differentially expressed between HPV-positive and HPV-negative oropharyngeal cancer (eg, TYMS, STMN1, CCND1, and RBBP4) are involved in chemotherapy or radiation sensitivity. CONCLUSION: These results suggest that differences in the biology of HPV-positive and HPV-negative oropharyngeal cancer may have implications for the management of patients with these different tumors

    Genetic Variation of the Human Urinary Tract Innate Immune Response and Asymptomatic Bacteriuria in Women

    BACKGROUND: Although several studies suggest that genetic factors are associated with human UTI susceptibility, the role of DNA variation in regulating early in vivo urine inflammatory responses has not been fully examined. We examined whether candidate gene polymorphisms were associated with altered urine inflammatory profiles in asymptomatic women with or without bacteriuria. METHODOLOGY: We conducted a cross-sectional analysis of asymptomatic bacteriuria (ASB) in 1,261 asymptomatic women ages 18-49 years originally enrolled as participants in a population-based case-control study of recurrent UTI and pyelonephritis. We genotyped polymorphisms in CXCR1, CXCR2, TLR1, TLR2, TLR4, TLR5, and TIRAP in women with and without ASB. We collected urine samples and measured levels of uropathogenic bacteria, neutrophils, and chemokines. PRINCIPAL FINDINGS: Polymorphism TLR2_G2258A, a variant associated with decreased lipopeptide-induced signaling, was associated with increased ASB risk (odds ratio 3.44; 95% CI, 1.65-7.17). Three CXCR1 polymorphisms were associated with ASB caused by gram-positive organisms. ASB was associated with urinary CXCL-8 levels, but not CXCL-5, CXCL-6, or sICAM-1 (P ≤ 0.0001). Urinary levels of CXCL-8 and CXCL-6, but not ICAM-1, were associated with higher neutrophil levels (P ≤ 0.0001). In addition, polymorphism CXCR1_G827C was associated with increased CXCL-8 levels in women with ASB (P = 0.004). CONCLUSIONS: TLR2 and CXCR1 polymorphisms were associated with ASB and a CXCR1 variant was associated with urine CXCL-8 levels. These results suggest that genetic factors are associated with early in vivo human bladder immune responses prior to the development of symptomatic UTIs
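
A reported odds ratio and its confidence interval can be checked for internal consistency with a short calculation. The sketch below assumes the interval is a Wald-type 95% interval, symmetric on the log scale (the abstract does not state how it was computed), in which case the geometric mean of the bounds should reproduce the point estimate. The function name is hypothetical.

```python
import math


def log_scale_summary(lower, upper, z=1.96):
    """Return the odds ratio and log-scale standard error implied by a CI,
    assuming the interval is exp(log(OR) +/- z * SE)."""
    log_lower, log_upper = math.log(lower), math.log(upper)
    implied_or = math.exp((log_lower + log_upper) / 2.0)  # geometric mean of bounds
    implied_se = (log_upper - log_lower) / (2.0 * z)
    return implied_or, implied_se


# Bounds reported for TLR2_G2258A in the abstract above.
implied_or, implied_se = log_scale_summary(1.65, 7.17)
```

Under that assumption the implied odds ratio is about 3.44, matching the reported estimate to two decimal places.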

    Comprehensive Analysis of HLA-A, HLA-B, HLA-C, HLA-DRB1, and HLA-DQB1 Loci and Squamous Cell Cervical Cancer Risk

    Variation in human major histocompatibility genes may influence the risk of squamous cell cervical cancer (SCC) by altering the efficiency of the T-cell–mediated immune response to human papillomavirus (HPV) antigens. We used high-resolution methods to genotype human leukocyte antigen (HLA) class I (A, B, and Cw) and class II (DRB1 and DQB1) loci in 544 women with SCC and 542 controls. Recognizing that HLA molecules are codominantly expressed, we focused on co-occurring alleles. Among 137 allele combinations present at >5% in the case or control groups, 36 were significantly associated with SCC risk. All but one of the 30 combinations that increased risk included DQB1*0301, and 23 included subsets of A*0201-B*4402-Cw*0501-DRB1*0401-DQB1*0301. Another combination, B*4402-DRB1*1101-DQB1*0301, conferred a strong risk of SCC (odds ratio, 10.0; 95% confidence interval, 3.0–33.3). Among the six combinations that conferred a decreased risk of SCC, four included Cw*0701 or DQB1*02. Most multilocus results were similar for SCC that contained HPV16; a notable exception was A*0101-B*0801-Cw*0701-DRB1*0301-DQB1*0201 and its subsets, which were associated with HPV16-positive SCC (odds ratio, 0.5; 95% confidence interval, 0.3–0.9). The main multilocus associations were replicated in studies of cervical adenocarcinoma and vulvar cancer. These data confirm that T helper and cytotoxic T-cell responses are both important cofactors with HPV in cervical cancer etiology and indicate that co-occurring HLA alleles across loci seem to be more important than individual alleles. Thus, certain co-occurring alleles may be markers of disease risk that have clinical value as biomarkers for targeted screening or development of new therapies